 convolution layer


Deeply Shared Filter Bases for Parameter-Efficient Convolutional Neural Networks

Neural Information Processing Systems

Modern convolutional neural networks (CNNs) contain a large number of identical convolution blocks; hence, recursive sharing of parameters across these blocks has been proposed to reduce the number of parameters. However, naive sharing of parameters poses many challenges, such as limited representational power and vanishing/exploding gradients in the recursively shared parameters. In this paper, we present a recursive convolution block design and training method in which a recursively shareable part, or filter basis, is separated and learned while effectively avoiding the vanishing/exploding gradients problem during training. We show that the vanishing/exploding gradients problem can be controlled by enforcing the elements of the filter basis to be orthonormal, and empirically demonstrate that the proposed orthogonality regularization improves the flow of gradients during training. Experimental results on image classification and object detection show that our approach, unlike previous parameter-sharing approaches, does not trade performance to save parameters and consistently outperforms overparameterized counterpart networks. This superior performance demonstrates that the proposed recursive convolution block design and the orthogonality regularization not only prevent performance degradation but also consistently improve representation capability while a significant number of parameters are recursively shared.
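The abstract does not give the exact form of the orthogonality regularizer, so the following is only a minimal sketch of one common choice: a soft penalty equal to the squared Frobenius norm of B Bᵀ − I, where each row of B is a flattened filter-basis element. The function names and the toy filters are illustrative assumptions, not the authors' implementation.

```python
def gram(basis):
    """Gram matrix G[i][j] = <basis[i], basis[j]> for flattened filters."""
    return [[sum(a * b for a, b in zip(u, v)) for v in basis] for u in basis]


def orthogonality_penalty(basis):
    """Squared Frobenius norm of (B B^T - I); zero iff the rows are orthonormal.

    Assumed form of the regularizer (a standard soft orthogonality penalty);
    the paper's exact loss term may differ.
    """
    g = gram(basis)
    n = len(basis)
    return sum(
        (g[i][j] - (1.0 if i == j else 0.0)) ** 2
        for i in range(n)
        for j in range(n)
    )


# Two flattened 2x2 basis filters that are orthonormal, and two that are not.
orthonormal = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
correlated = [[1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]

print(orthogonality_penalty(orthonormal))  # no penalty for an orthonormal basis
print(orthogonality_penalty(correlated))   # positive penalty for duplicated filters
```

In training, a term like this would typically be added to the task loss with a small weight, so that the shared basis is pushed toward orthonormality without being hard-constrained.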


Efficient Equivariant Network: Supplementary Materials. A. MNIST-rot Model Architecture

Neural Information Processing Systems

Please refer to Table 5. Table 5: Architecture of E4-Net for MNIST-rot classification, where p denotes the dropout rate. The hyperparameters we use in this architecture are kernel size k = 5, reduction ratio r = 1, and number of slices s = 2. In the large model, we increase the channel dimension to 24, the number of slices to 12, and the reduction ratio to 2, and keep the other hyperparameters the same. We take ResNet-18 [2], which is composed of an initial convolution layer, followed by 4 stage Res-Blocks and one final classification layer.


Efficient Equivariant Network

Neural Information Processing Systems

Convolutional neural networks (CNNs) have dominated the field of computer vision and achieved great success due to their built-in translation equivariance. Group equivariant CNNs (G-CNNs), which incorporate additional equivariance, can significantly improve the performance of conventional CNNs. However, G-CNNs face two major challenges: a spatial-agnostic problem and expensive computational cost. In this work, we propose a general framework of previous equivariant models, which includes G-CNNs and equivariant self-attention layers as special cases.


See More for Scene: Pairwise Consistency Learning for Scene Classification

Neural Information Processing Systems

Scene classification is a valuable classification subtask with its own characteristics, which still need more in-depth study. Scene characteristics are distributed over the whole image, which creates the need to "see" comprehensive and informative regions. Previous works mainly focus on region discovery and aggregation, and rarely consider the inherent properties of CNNs or their potential ability to satisfy the requirements of scene classification. In this paper, we propose to understand scene images and scene classification CNN models in terms of the focus area. From this new perspective, we find that a large focus area is preferred in scene classification CNN models as a consequence of learning scene characteristics. Meanwhile, our analysis of existing training schemes helps us understand the effects of the focus area, and also raises the question of the optimal training method for scene classification.


Log-Polar Space Convolution Layers: Appendix

Neural Information Processing Systems

A.1 Statistics of correlations between different regions and the center pixel

We calculate the correlations between image pixels in different log-polar regions and the center pixels on the training set of CIFAR-100. Specifically, for each pixel in each image, we divide its 11 × 11 neighboring area into different regions by LPSC with 3 distance levels, 8 direction levels, and a growth rate of 2. The center pixels of all areas form the center set. The pixels at the same position of all areas also form a pixel set. For each position, we calculate the correlation score between the corresponding pixel set and the center set. The correlation scores of positions in the same region of all training images are averaged to obtain the correlation score between the region and the center pixel.
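The procedure above can be sketched roughly as follows. Several details are assumptions rather than facts from the appendix: the correlation score is taken to be the Pearson coefficient, distance level l is assumed to cover radii up to r1 · growth^l with a hypothetical innermost radius r1 = 2, and direction levels are assumed to be equal angular sectors starting at angle 0. All function names are illustrative.

```python
import math


def log_polar_region(dx, dy, levels=3, directions=8, growth=2.0, r1=2.0):
    """Map an offset from the center to (distance_level, direction_bin).

    Assumed binning (not taken from the paper): level l covers radii up to
    r1 * growth**l, and offsets beyond the last radius are clamped into the
    outermost level. The center itself has no region and returns None.
    """
    if dx == 0 and dy == 0:
        return None
    radius = math.hypot(dx, dy)
    level = 0
    while level < levels - 1 and radius > r1 * growth ** level:
        level += 1
    angle = math.atan2(dy, dx) % (2 * math.pi)
    sector = int(angle / (2 * math.pi / directions)) % directions
    return level, sector


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (assumed score)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def region_scores(center_set, position_sets):
    """Average per-position correlations within each log-polar region.

    position_sets maps an offset (dx, dy) to the pixel values observed at
    that offset, aligned one-to-one with center_set.
    """
    per_region = {}
    for (dx, dy), pixels in position_sets.items():
        region = log_polar_region(dx, dy)
        if region is None:
            continue
        per_region.setdefault(region, []).append(pearson(pixels, center_set))
    return {region: sum(v) / len(v) for region, v in per_region.items()}


# Toy data: four neighborhoods, two offsets (real use would cover all 11 x 11 offsets).
center = [1.0, 2.0, 3.0, 4.0]
positions = {(1, 0): [2.0, 4.0, 6.0, 8.0], (0, 3): [4.0, 3.0, 2.0, 1.0]}
scores = region_scores(center, positions)
print(scores)
```

With real data, `center` would hold the center pixel of every 11 × 11 neighborhood in the training set, and `positions` would hold one aligned pixel list per offset in the window.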



1f9f9d8ff75205aa73ec83e543d8b571-Supplemental.pdf

Neural Information Processing Systems

We repeat the theorems presented in Sec. 3 and provide their proofs below. The theorems hold for Neumann boundary conditions, which we use in our implementation; this is achieved by the construction of the differential operators. The proofs follow the ones presented in [22]. If the activation function σ(·) is monotonically non-decreasing and sign-preserving, then the forward propagation through the diffusive PDE in (1) for t ∈ [0, ∞) yields a non-increasing feature norm, that is, ∂‖f‖²/∂t ≤ 0. Proof. Let us examine the following inner product following Eq.



Synaptic Strength For Convolutional Neural Network

Neural Information Processing Systems

Convolutional neural networks (CNNs) are both computation- and memory-intensive, which has hindered their deployment on mobile devices. Inspired by a related concept in the neuroscience literature, we propose Synaptic Pruning: a data-driven method to prune connections between input and output feature maps with a newly proposed class of parameters called Synaptic Strength. Synaptic Strength is designed to capture the importance of a connection based on the amount of information it transports. Experimental results show the effectiveness of our approach. On CIFAR-10, we prune up to 96% of the connections in various CNN models, which results in significant size reduction and computation savings. Further evaluation on ImageNet demonstrates that synaptic pruning is able to discover efficient models that are competitive with state-of-the-art compact CNNs such as MobileNet-V2 and NasNet-Mobile. Our contributions are summarized as follows: (1) We introduce Synaptic Strength, a new class of parameters for CNNs that indicates the importance of each connection.
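The abstract does not spell out the pruning criterion, so the sketch below assumes a simple one: each input-to-output feature-map connection carries one scalar strength, and the connections with the smallest absolute strength are removed until a target ratio is reached. The function name, the strength matrix, and the thresholding rule are all hypothetical illustrations.

```python
def prune_by_strength(strength, prune_ratio):
    """Return a 0/1 mask that drops the weakest connections.

    strength[o][i] is the (hypothetical) synaptic strength of the connection
    from input feature map i to output feature map o. Connections whose
    absolute strength is at or below the threshold are pruned, so ties at
    the threshold can push the pruned fraction slightly above prune_ratio.
    """
    flat = sorted(abs(s) for row in strength for s in row)
    n_prune = int(len(flat) * prune_ratio)
    if n_prune == 0:
        return [[1 for _ in row] for row in strength]
    threshold = flat[n_prune - 1]
    return [[1 if abs(s) > threshold else 0 for s in row] for row in strength]


# Toy layer with 2 output maps and 3 input maps; prune half of the 6 connections.
strength = [[0.9, 0.01, 0.5],
            [0.02, 0.7, 0.03]]
mask = prune_by_strength(strength, prune_ratio=0.5)
print(mask)
```

A mask like this would then be applied to the corresponding kernels, so that each pruned connection's entire k × k kernel is zeroed and can be skipped at inference time.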